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Summary. In order to interpret and explain the physiological signal behaviors, it can be in- 
teresting to find some constants among the fluctuations of these data during all the effort or 
during different stages of the race (which can be detected using a change points detection 
method). Several recent papers have proposed the long-range dependence (Hurst) parameter 
as such a constant. However, their results induce two main problems. Firstly, DFA method 
is usually applied for estimating this parameter. Clearly, such a method does not provide the 
most efficient estimator and moreover it is not at all robust even in the case of smooth trends. 
Secondly, this method often gives estimated Hurst parameters larger than 1, which is the larger 
possible value for long memory stationary processes. In this article we propose solutions for 
both these problems and we define a new model allowing such estimated parameters. On the 
one hand, a wavelet-base estimator is applied to data. Such an estimator provides optimal 
convergence rates in a semiparametric context and can be used for smoothly trended pro- 
cesses. On the other hand, a new semiparametric model so-called locally fractional Gaussian 
noise is introduced and is characterized by a so-called parameter which can be larger than 
1. Such semiparametric process is tested to be relevant for modeling HR data in the three 
characteristic phases of the race. It also shows an evolution of the local fractality parameter 
during the race confirming the results obtained by Peng etal. (1995) in their study regarding 
Hurst parameter of HR time series during the exercise for healthy adults (where the estimated 
parameter is close to that observed in the race beginning) and heart failure adults (where the 
estimated parameter is close to that observed in the end of race). So, this evolution, which can 
not be observed with DFA method, may be associated with fatigue appearing during the last 
phase of the marathon. 

Keywords: Wavelet analysis; Detrended fluctuation analysis; Fractional Gaussian noise; Self- 
similarity; Hurst parameter; Long-range dependence processes; Heart rate time series 



1. Introduction 

The content of this article was motivated by a general study of physiological signals of run- 
ners recorded during endurance races as marathons. More precisely, after different signal 
procedures for "cleaning" data, one considers the time series resulting of the evolution of 
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heart rate (HR) data during the race. The following figure provides several examples of such 
data (recorded during Marathon of Paris). For each runner, the periods (in ms) between 
the successive pulsations (see Fig. Q]) are recorded. The HR signal in number of beats per 
minute (bpm) is then deduced (the HR average for the whole sample is of 162 bpm). 

This paper, focuses on the modeling and the estimation of relevant parameters charac- 
terizing these instantaneous heart rate signals of athletes recorded during the marathon. 
We have chosen to focus in an exponent that can be called "Fractal", which indicates the 
local regularity of the path and the dependency between data. In certain stationary cases, 
this parameter is close to the Hurst parameter, defined for long range dependent (LRD) 
processes. 
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Fig. 1. Heat rate signals of Athlete 1 in ms, Hertz and BPM (up), of Athletes 2, 3 and 4 in BPM 

(down) 



The LRD behavior is often seen on various data. This phenomenon was developed in 
many fields beginning with hydrology (Hurst, 1951), telecommunication, biomechanics and 
recently in economy and finance. A mathematical and signal processing methods have 
also noted the prese nce of LRD in tim e series describing the fluc tuations over time of 
physiological signals ( Goldberger f2001 ). Goldberger et al. (200$]). Indeed, n umerous 



au thors have studied heartbeat time series (see for instance Peng et al. f!993T ). (1995) 
or lAbsil et al. Tl 999l) and the model proposed to fit these data is a trended long memory 
process with an estimated Hurst parameter close to 1 (and sometimes more than 1). In this 
article, three improvements have been proposed to such a model: 

(a) Data are stepped in three differe nt stages which are detected using a change points 
detection method (see for instance Lavielle (1999h ) developed in Section 2. The main 



idea of the detection's method is to consider that the signal distribution depends on a 
vector of unknown characteristic parameters constituted by the mean and the variance. 
The different stages (beginning, middle and end of the race) and therefore the different 
vectors of parameters, which change at two unknown instants, are estimated. This 
first step is important since the model defined below fits well for each sub-series but 
not at all for the whole HR data, 
(b) The parameter H which is very interesting for interpreting and explaining the physio- 
logical signal behaviors, is estimating using two methods the DFA method and wavelet 
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analysis in Section 3. 

The DFA which is a version, for time series with trend, of the aggregated variance 



asymptotic properties of the DFA function and the deduced estimator of H are stud- 
ied in the case of fractional Gaussian noise and extended to a general class of sta- 
tionary semiparametric long-range dependent processes with or without trend. We 
have shown that DFA method is not as efficient to estimate the Hurst parameter of 
stationary long memory processes, than other methods such as log-periodogram (see 
for instance, Moulines and Soulier (2003) or wavelet analysis (see Abry et al. (1998), 
or Moulines et al. (2007)), which provide the optimal convergence rate (in sense of 
the minimax criterium). Moreover, if the DFA method is not at all robust in the case 
of polynomial trends, this is not such a case of the wavelet analysis method. Finally, 
a goodness-of-fit test can be deduce from the wavelet analysis method. Despite the 
popularity of DFA method in numerous papers concerning such physiological signals 
(see for instance, Absil et al. (1999), Ivanov et al. (2001), Peng et al. (1993), (1995)), 
it is therefore clearly more interesting to use the wavelet based estimator in view of 
estimate a "fractal" exponent of HR data. 

As a first model, the usual fractional Gaussian noise (FGN) is then proposed for 
modeling HR data. In such a context, the wavelet based estimator provides two re- 
sults. Firstly, the estimated parameter often exceeds the value 1, which is the largest 
possible value for a FGN. Secondly, even for the 3 different stages of the race, the 
goodness-of-fit test is always rejected, 
(c) In Section 4, we propose to model HR data, during each stage, with a generalization 
of fractional Gaussian noise, called locally fractional Gaussian noise. Such stationary 
process is built from a parameter called local fractality parameter which is a kind of 
Hurst parameter in a restricted band frequency (that may take values in E, and not 
only in (0, 1) as usual Hurst parameter). The estimation of local fractality parameter 
and also the construction of goodness-of-fit test can be made with wavelet analysis. 
We also show the relevance of model and an evolution of the parameter during the 
race, which confirms results obtained by other authors in their study regarding the 
distinguish of healthy from pathologic data (see Peng et al. (1995)). 

2. Abrupt change detection 

During effort, one or more phases can be observed in recorded HR series, which evolve and 
change differently from an athlete to another: the transition step, recorded between the race 
beginning and the stage of HR reached during the effort, the main stage during the exercise 
and an arrival phase until the race end. So, after cleaning the HR data (implying that only 
9 athlete HR data are now considered) an automatic detection of changes is applied to HR 
time series cutting its in different race phases - beginning, middle and end. 

The change po int detection method used here is developed by Lavielle (see for instance 
Lavielle f!999h ). The main idea is to consider that the signal distribution depends on a 
vector of unknown characteristic parameters in each stage. The different stages and there- 
fore the different vectors of parameters, change at unknown instants. For instance and it 
will be our choice, changes in mean and variance can be detected. Applied to the data, the 
change point detection method along these two phenomena distinguishes beginning, middle 
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and end of race. So, it may be possible to envisage a piecewise stationarity i.e. that the 
process is almost stationary on fixed time intervals and it remains the modeling of this 
stationary component. 



General principle of the method of change detection 

Assume that a sample of a time series (Y(i),i = 1, . . . ,n) is observed. Assume also that 
it exists r = (ti,T2, . . . , tr-_i) with = tq < n < T2 < ... < tk-\ < n = tk and such 
that for each j <E {1,2,..., K}, the distribution law of Y(i) is depending on a parameter 
9j e 6 C R d (with d € IN) for all Tj_i < i < tj. Therefore, K is the number of segments 
to be deduced starting from the series and r = (n, T2, . . . , tk-i) is the ordered change 
instants. Now, define a contrast function 

Ue{Y(T j + l),Y(T j + 2),...,Y(T j+1 )), 

of 6 e R d applied on each vector (Y(Tj + l),Y(Tj+2), . . . ,Y(T j+1 )) for all j € {0, 2, . . . , K- 
1}. A general example of such a contrast function is 

U e {Y(r 3 + 1),Y(tj +2),..., Y{r J+1 )) = -2 logL e (Y( Tj + 1), Y( Tj +2),..., Y( Tj+1 )), 

where Lg is the likelihood. Then, for all j £ {0, 2, . . . , K — 1}, define: 

9 3 = Argmin ^(F^- + 1), Y( Tj +2),..., Y(r j+1 )). 

Now, set: 

K-l 

G(n, . . . , tk_0 = U h Pte + 1), Y{jj + 2), . . . , y(r, +1 )) 

As a consequence, an estimator (ri, . . . ,tr:-i) can be defined as: 

. . . ,t K -i) = Argmin G(n, . . . ,t K -i)- (1) 

0<T 1 <T2<...<T K -i<n 



The principle of such method of estimation is very general (it can be also devoted to estimate 
abrupt change in polynomial trends) and different asymptotic behavior of the estimator 
(ri, . . . ,tk-i) can be deduced under general assumption on the time series Y (see for 
instance Bai Perron, 1998, Lavielle, 1999, or Moulines and Lavielle, 2000). 
For HR data, it is obvious that the beginning and the end of the race implies respectively an 
increasing (respectively decreasing) of the mean of HR. However, for avoiding all confusion 
linked for instance to the stress of the runner or other harmful noises, it was chosen to 
detect a change in mean and variance. 

Change detection in mean and variance 

Therefore, for all j € {0, 1, . . . , K — 1}, consider the following general model: 

Y(i) = fij + (TjEi for all i e {tj + 1, . . . , tj+i}, 

where 6j = (m,j,<jj) e R x (0,oo) and (ei) is a sequence of zero-mean random variables 
with unit variance. 
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In the case of changes in both mean and variance, and it is such a framework we consider 
for the heart rates series, a "natural" contrast function is defined by: 



U 6] (Y(t ]+ 1),...,Y(t ]+1 )) 



g {Y(l)-m 3 y 



-Tt+l 



an therefore the well-known estimator of Qj is: 



9 3 - = (m,-,^) = (-^— J2 Y(£), 



0"7 



Tj+l 



Now, the estimator (ri, . . . , tk-i) can be deduced from JT]). When the number of changes is 
unknown, the procedure is exactly the same except that the number of changes is estimated. 
Thus, a new contrast V is built by adding to the previous contrast U an increasing function 
depending on the change number K, i.e. more precisely, 



V(n,...,T K -uK) = G{ n) 



with f3 > 0. As a consequence, by minimizing V in t\ 
obtained which varies with the penalization parameter j3 
For HR data, the choice of pen(K) was K. Let Gk = G(ti 
we define ^ ^ 

Gi<i — Gki 



tk-i) + P x pen(JsT), 

,tk-i,K, an estimator K is 
jTrt-i), for if = K\,... ,K M ax 



K 



and k — Pi — Pi+i with i > 1. 



Then the retained K is the greatest value of Ki such that >> lj for j > i. 
Applied to the whole set HR data, the number of abrupt changes is estimated at 4 or 3. 
Three phases were selected to be studied, which are located in the beginning of the race, in 
the middle and in the end (see for example Fig. [2]). However for certain recorded signals 
the first or the last phase can not be distinguished probably for measurements reasons. 




Fig. 2. The estimated configuration of changes in a HR time series of an athlete 
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In order to unveil if a change of behavior of HR series was happened during these three 
detected phases of the marathon, we propose a new model for these sub-series characterized 
by a parameter H. Several common estimators of this parameter, so-called scaling behavior 
exponents, consist in performing a linear regression fit of a scale-dependent quantity versus 
the scale in a lo garithmic represen tation. This includes the Detrende d Fluctuation Analy sis 
(DFA) method Peng et al. (19941 ) and the wavelet analysis method Abrv et al. (200$ ). 



3. A fractional Gaussian noise for modeling HR series and the estimation of the 
Hurst parameter 

In this section, a first model, the fractional Gaussian noise, is proposed for modeling HR 
data. After a statistic study, one chooses to estimate the Hurst parameter with a wavelet 
based estimator instead of the DFA method (which is commonly used in physiologic papers 
despite its weak performances). Moreover, a test built from the wavelet based method shows 
the badness-of-fit of this model to the data. 



3.1. A first model: the fractional Gaussian noise 

When we observe entire or partial (during the three phases) HR time series, we remark that 
it exhibits a certain persistence and the related correlations decays very slowly with time 
what characterizes trajectories of a long memory Gaussian noise. Also, the distribution of 
data recorded during the phases leads as to suspect a Gaussian behavior in these data. Of 
course this is only an assumption and we can check it with tests considered for long range 
dependent processes. But in our case we will try to test whether a Gaussian process could 
model these data. Moreover, the aggregated signals (see for example Fig. [3]) present a 
certain regularity very close to that of fractional Brownian motion simulated trajectories 
with a parameter close to 1 (Fig. [5]). So, fractional Gaussian noise could be an appropriated 
model to HR series. 




Fig. 3. The self-similarity of the aggregated HR signals (representation of the aggregated HR fluctu- 
ations at 3 different time resolutions) 
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The following Figure |4] presents a comparison between the graphs of HR data during a stage 
(detected previously) and a fractional Gaussian noise (FGN in the sequel) with parameter 
H = 0.99 (see the definition above). Before using statistical tools for testing the similarities 
of both these graphs, let us remind some elements concerning the FGN. 




-2 1 1 1 1 1 1 1 1 1 
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Fig. 4. Comparison of HR data in the middle of race (Ath4) and generated FGN(H=0.99) trajectories 

The FGN is one of the most famous example of stationary long range dependent (LRD 
in the sequel) process. The LRD phenomenon was observed in many fields including 
telecommunication, hydrology, biomechanic, economy... A stationary second order process 
Y = {Y(k), k e JN} is said to be a LRD process if: 

^|r y (fc)|=oo with r Y (k) = E[F(0)Y(fc)] ■ 
fceiN 

Thus Y(k) is depending on Y(0) even if k is a very large lag. Another way for writing the 
LRD property is the following: 

r Y (k) ~ k 2H - 2 L(k) , as k -> oo, 

with L(k) a slowly varying function (i.e. \/t > 0, L(xt) / L(x) — ► 1 when x — > oo) and the 
Hurst parameter H € (5, 1). 

The LRD is closely related to the self-similarity concept. A process X — {X(t), t > 0} is 
so called a self-similar process with self-similarity exponent H, if Vc > 0: 

{X(ct)) t ^c H (X(t)) t . 

Now, if we consider the aggregated process {X(t), t > 0} defined by X(k) — Yl%=i 
with a LRD process Y, then under weak conditions (for instance Y is a Gaussian or a 
causal linear process), it can be proved that, roughly speaking, for k — > 00, the distribution 
of {X(t),t > k} is a self-similar distribution (see Doukhan et al, 2003, for more details). 
The FGN is an example of a LRD Gaussian process. More precisely, Y H = {Y H (k) 1 k € IN} 
is a FGN, when 

2 

r Y »(k) - + l\ 2H - 2\k\ 2H + \k - 1\ 2H ) Vfc G IN, 



8 Bardet et al. 

with H <E (0,1) and a 2 > 0. As a consequence, for H 6 (i, 1), a usual Taylor formula 
implies 

r YH (k) ~ a 2 H{2H - \)k 2H ~ 2 , when k -> oo. 

For a zero- mean FGN, the corresponding aggregated process, denoted here is so-called 
the fractional Brownian motion (FBM) and X H is a self-similar Gaussian process with 
self-similar parameter H and therefore satisfies, 

Yax(X H (k)) = a 2 \k\ 2H Vfc e IN 

(it can be even proved that X H is the only Gaussian self-similar process with stationary 
increments). It is obvious that Y H {k) — X H (k) — X H (k—l), the sequence of the increments 
of a FBM, is a FGN. 




Several generated trajectories of FGN and corresponding FBM are presented in Fig. [5] for 
different values of H. 

For testing if a HR path can be suitably model by a FGN, a first step consists in estimating 
H. Here we chose to use two estimators (but there exist many else, see for instance Doukhan 
et al., 2003) that are known to be unchanged to the presence of a possible trend. 



3.2. Two estimators of the Hurst parameter: DFA and wavelet based estimators 

For estimating H, a frequently used method in the case of physiological data processing is 
the Detrended Fluctu ation Analysis (DFA). The DFA method was introduced by Peng et 



al. iPeng et al. (1994 1. The DFA method is a version for trended time series of the method 



of aggregated variance used for long-memory stationary process. It consists briefly on: 

(a) Aggregated the process and divided it into windows with fixed length, 

(b) Detrended the process from a linear regression in each windows, 

(c) Computed the standard deviation of the residual errors (the DFA function) for all 
data, 
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(d) Estimated the coefficient of the power law from a log-log regression of the DFA function 
on the length of the chosen window (see Fig. [6]). 

After the first stage, the process is supposed to behave like a self-similar process with 
stationary increments added with a trend (see previously). The second stage is supposed 
to remove the trend. Finally, the third and fourth stages are the same than those of the 
aggregated method (for zero- mean stationary process). An example of the DFA method 
applied to a path of a FGN with different values of H is shown in Fig. [6l 




Fig. 6. Results of the DFA method and wavelet analysis applied to a path of a discretized FGN for 
different values of H = 0.2, 0.4, 0.5, 0.7, 0.8, with N = 10000 



In Bardet and Kammoun f20C)8h , the asymptotic properties of the DFA function in case of 



a FGN path (F(l), . . . , Y(N)) are studied. In such a case the estimator Hdfa converges to 
H with a non-optimal convergence rate (N 1 ' 3 instead of TV 1 / 2 reached for instance by max- 
imum likelihood estimator) . An extension of these results for a general class of stationary 
Gaussian LRD processes is also established. In this semiparametric frame, we have shown 
that the estimator Hdfa converges to H with an optimal convergence rate (following the 
minimax criteria) when an optimal length of windows is known. 

The processing of experimental data, and in particular physiological data, exhibits a major 
problem that is the nonstationarity of the signal. Hu et al. (2001) have studied different 
types of nonstationarity associated with examples of trends and deduced their effect on an 
added noise and the kind of competition who exists between this two signals. They have 
also explained (2002) the effe cts of three other types of no nstationarity, which are often 
encountered in real data. In iBardet and Kammoun (2008f ). we proved that Hdfa does 
not converge to H when a polynomial trend (with degree greater or equal to 1) or a piece- 
wise constant trend is added to a LRD process: the DFA method is clearly a non robust 
estimation of the Hurst parameter in case of trend. 



For improving this estimation at least for polynomial trended LRD process, a wavelet based 
estimator is now considered. This method has been introduced by Flandrin (1992) and was 
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developed by Abry et al. (2003) and Bardet et al. (2000). In Wesfreid et al. (2005), a 
multifractal analysis of HR time series is presented for trying to unveil their scaling law 
behavior using the Wavelet Transform Modulus Maxima (WTMM) method. 

Let ip : R — > R a function so-called the mother wavelet. Let (a, b) <G R*j_ x R and de- 
note A = (a, b) . Then define the family of functions ip\ by 

Parameters a and b are so-called the scale and the shift of the wavelet transform. Let us 
underline that we consider a continuous wavelet transform. Let dz(a,b) be the wavelet 
coefficient of the process Z = {Z(t), t e R} for the scale a and the shift b, with 

lft 

d z (a, b) = -= / V(- - b)Z(t)dt =< Va, Z >l'(b) ■ 

For a time series instead of a continuous time process, a Riemann sum can replace the 
previous integral for providing a discretized wavelet coefficient ez(a,b). The function ip is 
supposed to be a function such that it exists M S IN* satisfying , 

[ t m %P{t)dt = Q for all me {0,1,..., M}. (2) 

JR 

Therefore, ip has its M first vanishing moments. The wavelet based method can be applied 
to LRD or self-similar processes for respectively estimating the Hurst or the self-similarity 
parameter. This method is based on the following properties: for Z a stationary LRD 
process or a self-similar process having stationary increments, for all a > 0, (dz(a, b))ben is 
a zero-mean stationary process and 

• If Z is a stationary LRD process, 

E(4(a, b)) = V&r(d z (a, b)) - C(tp, H)^ 11 ' 1 when a -» oo 

• If Z is a self-similar process having stationary increments, 

E(4(a, b)) = Var(dz(a, b)) - K(ip, H)a 2H+1 for all a > 

with C(V>, H) and K (tp, H) two positive constants depending only on ip and H (those results 
are proved in Flandrin, 1992, Abry et al, 1998). Therefore, in both these cases, the variance 
of wavelet coefficients is a power law of a, and a log-log regression provides an estimator of 
H. From a path {Z(l), . . . , Z(N)), the estimator will be deduced from the log-log regression 
of the "natural" sample variance of discretized wavelet coefficients, i.e., 

1 WH 

SN{a) = W/a} 5> l(a ' i} - (3) 

A graph (logo^, log <SW(«i))i<i<^ is drawn from a priori family of scales and the slope of the 
least square regression line provides the estimator Hy/AV of H . In the semiparametric frame 
of a general class of stationary Gaussian LRD processes (more general than the previous 
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Table 1. Comparison of the two samples of estimations of H with 100 realizations of fGn path 
(N=1 0000) with DFA and wavelets methods (The p-val was deduced from comparison of the mean 
of each sample with theoretical one at the 5% level) 



HfGn 


\Biass 


\BiasQ 


p-val DFA 


p-val WAV 


VMSE DFA 


VMSE WAV 


0.50 


0.0064 


0.0071 


0.0152 


0.0983 


0.0271 


0.0433 


0.60 


0.0092 


0.0009 


0.0017 


0.8289 


0.0304 


0.0405 


0.70 


0.0141 


0.0015 


10 -5 


0.7342 


0.0347 


0.0436 


0.80 


0.0125 


0.0050 


0.0002 


0.1978 


0.0349 


0.0391 


0.90 


0.0179 


0.0062 


2 • 10 -6 


0.1030 


0.0407 


0.0448 



semiparametric context required for Hdfa), it was established by Moulines et al. (2007) 
that the estimator Hwav converges to H with an optimal convergence rate (following the 
minimax criteria) when an optimal length of windows is known. Thus, theoretical asymp- 
totic behaviors of Hdfa and Hwav are comparable for FGN and a semiparametrical class 
of LRD Gaussian processes (however more general for Hwav)- 

This is not true any more when a polynomial trended LRD (or self-similar) processes is 
considered. Indeed, Abry et al. (1998) remarked that every degree M polynomial trend is 
without effects on Hwav since ip has its M first vanishing moments. Therefore, the larger 
M, the more robust Hwav is. 

Finally, Bardet (2002) established a chi-squared goodness-of-fit test for a path of FBM 
(therefore for aggregated FGN) using wavelet analysis. This test is based on a (penalized) 
distance between the points (logai,logSV(ai))i<j<^ and a pseudo-generalized least square 
regression line (here the scales ai are selected to behave as TV 1 / 3 ). 

In the Table Q] appear the different estimations of H computed from the DFA and wavelet 
analysis methods for 100 realizations of FGN paths with TV = 10000. We choose for these 
simulations the concrete procedure of wavelet analysis developed by Abry et al. (2003) (a 
Daubechies wavelet is chosen and a Mallat's fast pyramidal algorithm is used to compute 
wavelet coefficients). In one hand, the wavelets method appear slightly more effective than 
DFA method considering the p-value which is very low for the sample of the DFA estima- 
tions compared to wavelet analysis estimations. This is essentially due to the estimator 
bias which is more important in the case of DFA. In the other hand, if we consider the root 
of MSE which is the sum of the squared bias and the variance, the DFA estimator seems 
to be slightly more effective. Note that for FGN processes (without trend), the Whittle 
maximum likelihood estimator of H gives a "better" results (see Taqqu et al, 1999). 

3.3. Application of both the estimators to HR data 

Both these estimators of H can also be applied to the HR time series of the 9 athletes. The 
following figures Fig. [7] and Fig. [8] exhibit examples of applications of both the estimation 
method to HR data. For each athlete, it was first done to the whole time series, and then to 
the different phases of the race (as it was obtained from the detection of abrupt changes, see 
Section [2]). The estimation results of H, for the different signals observed during the three 
phases of the race, are recapitulated in the Table [2] using wavelets method and in Table [3] 
using DFA method. 
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Fig. 7. Two first steps of the DFA method applied to a HR series (up) and results of the DFA method 
applied to HR series for two different athletes (down) 



Log. of chosen scales 



Log. of chosen scales 



Fig. 8. The log-log graph of the variance of wavelet coefficients relating to the HR series observed 
during the race and in the end of race (Ath2) 



Two main problems result from these different estimations. First, Hdfa and Hwav are 
often larger than 1. However, the FGN is only defined for H € (0, 1). For defining a process 
allowing H > 1, three main assumptions of FGN have to be changed: 

(a) the assumption that the process is a stationary process; 

(b) the assumption that the process is a Gaussian process; 

(c) the assumption that only two parameters (H and a 2 ) are sufficient to define the 
process. 
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Table 2. Estimated H with wavelets methods for HR se- 
ries of different athletes 



Phases 





HR series 


Beginning 


Middle 


Race end 


Athl 


0.8931 


1.1268 


1.1064 


1.2773 


Ath2 


1.1174 


0.7871 


1.0916 


0.8472 


Ath3 


1.0208 


1.0315 


1.1797 




Ath4 


0.9273 




1.0407 


0.7925 


Ath5 


1.0986 


1.3110 


1.0113 


1.3952 


Ath6 


1.0769 


1.5020 


1.1597 


1.3673 


Ath7 


1.0654 


1.4237 


1.1766 


1.0151 


Ath8 


0.9568 


1.6600 


0.9699 


1.1948 


Ath9 


0.9379 


1.5791 


0.9877 


0.7263 



In the sequel (see below), a new model is proposed. Both the first assumptions are still 
satisfied and the third one is replaced by a semiparametric assumption. 

The second problem is implied by the results of the goodness-of-fit test (for wavelet analysis 
method). Indeed, this test is never accepted as well for the whole time series as for the par- 
tial times series. An explanation of such a phenomenon can be deduced from Figure [8l for 
the wavelet analysis, the points (logai,logSW(ai))i<j<£ are clearly lined for a, < a m , but 
not exactly lined for a,i> a m . Thus the HR time series seems to nearly behave like a FGN 
for "small" scales (or high frequencies), but not for "large" scales (or small frequencies). A 
process following this conclusion can not be the better fit of HR time series... 

Remark: this last conclusion leads also to a clear advantage of wavelet based over DFA 
estimator. Indeed, the DFA algorithm measures only one exponent characterizing the entire 
signal. Then, this method corresponds rather to the study of "monofractal" signals such 
as FGN. At the contrary, the wavelet method provides the graph (logaj,logSjv(aj))i<,<^ 
which can be very interesting for analyze the multifractal behavior of data (see also Billat 
et al, 2005). 

4. A new model for modeling HR data: a locally fractional Gaussian noise 

4. 1. The locally fractional Gaussian noise 

In Bardet and Bertrand (2007), a generalization of the FBM, so-called the (Mk )-multiscale 
FBM, was introduced. The (Mo)-FBM is a FBM with self-similarity parameter Hq. Roughly 
speaking, the (Mr-)-FBM has the same harmonizable representation (and therefore quite the 
same behavior as the FBM) than a FBM with self-similarity parameter Hi for frequencies 
|£| S [ui,w i+1 [ for all i = 0,...,K (K G IN). For instance, a (Mi) -FBM behaves as 
a FBM with self-similarity parameter Hq for small frequencies and as a FBM with self- 
similarity parameter Hi for high frequencies. Such a model was fruitfully used for modeling 
biomechanical signals (position of the center of pressure on a force platform during quiet 
postural stance measured at a frequency of 100 Hz for the one minute period). 

Here, Fig. [9]suggests than a fitted model for aggregated HR data should behave like a FBM 
with self-similarity parameter H for low frequencies and differently for high frequencies 
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Fig. 9. The log-log graph of the variance of wavelet coefficients relating to the HR series observed 
during the arrival phase (Ath6) with a frequency band of [0.01 12](right) and of [0.2 4](left). 



(and not necessary like a FBM). Thus define a locally fractional Brownian motion X p 
{X p (t), t e JR} as the process such that: 



x P (t) = 



,m _ i 



In P(0 

where the function p : R —* [0, oo) is an even continuous function such that: 
• p (f) = I |£|»+i/2 f or |£| e [cj 0)U;i ] with H e R, a > and < uj < wi 



(iAiei : 



< oo. 



and W(d£) is a Brownian measure and W(d£) its Fourier transform in the distribution 
meaning. Cramer and Leadbetter (1967) proved the existence of such Gaussian process 
with stationary increments. The main advantages of such process compared to usual FBM 
are the following: 

(a) X p "behaves" like a FBM only for local band of frequencies; 

(b) In this band, the parameter H is not restricted to be in (0, 1): it is in R. 

From this definition, one deduces a possible model for HR data: 

e 4 *« sin(£/2) 



Y p (t)=X p (t + l)-X p (t) = 2-1le( 



p(0 



w(do) 



for t e R. 



Note that Y p = {Y p (t),t <E R} is a stationary Gaussian process and the function 2 sin(£/2)p 
is so-called the spectral density of Y p . 
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Let An — > and NAn — > oo when — * oo. The wavelet based estimator can provide a 
convergent estimation of H when a path 

(Y p (A N ),Y p (2A N ),...,Y p (NA N )) 

and therefore a path (X p (An), ■ ■ ■ , X p (NAn)) is observed. Indeed, consider a "mother" 
wavelet ip such that ip : R i— > R is a C°° function satisfying : 



• for all s > 0, / |t s ^(t)| < oo; 

./R 



• its Fourier transform ^/>(£) is an even function compactly supported on [— (3, — a]U[a, f3] 
with < a < (3. 

a /3 

Then, using results of Bardet and Bertrand (2007), for all a > such that [— , — J C [cJo, oji], 
(3 a 

i.e. a € [ — , — 1, (dx a (a, 6))6eR is a stationary Gaussian process and 



E(d 2 Xp (a,.))=K(iP,H,a)-a 



2H+1 



with K (V>, H, a) > only depending on ip, H and a. However this property is checked if 
and only if the function ip is chosen such that: 

t < ^ 

Moreover, for a S [ — , — ], the sample variance Sn{o) defined in J3]) and computed from a 

path (Xp(Ajv), . . . , X p (NAn)) converges to F,(d Xp (a, .)) and satisfies a central limit theo- 
rem with convergence rate \/NAn. Thus, with fixed scales (a\, . . . , af) G [^-, ^] e , a log- 
log-regression of (at, SN(ai))i<i<£ provides an estimation of H (and a central limit theorem 
with convergence rate NAn satisfied by Hwav can also be established). As previously, 
we consider also chi-squared goodness-of-fit test based on the wavelet analysis and defined 
as a weighted distance between points (log(ai), log(SW(ai)))i<i<£ and a pseudo-generalized 
regression line. 

Remark: The main problem with these estimator and test is the localization of the suitable 
frequency band [wo, w i] (<^o and u>\ are assumed to be unknown parameters). A solution 
consists in selecting a very large band of scales and determining then graphically the "most" 
linear part of the set of points (log(ai), log(SW(ai)))i<;<<>. Another possible way may be to 
compute an adaptive estimator of this band using a quadratic criterion (following a similar 
procedure than in Bardet and Bertrand, 2007). Here, like 9 different paths of HR data are 
observed, a common frequency band [ZZJo,aJT] can be graphically obtained and used for whole 
HR data (see above). 



4.2. Application to HR data 

First, one considers that a HR time series (^(1), ■ ■ ■ , Y(n)) can be written (Y p (A n ), Y p (2A n ), 
. . . ,Y p (n A„)), Y p = {Y p (t) } t e R} a process defined as previously. Secondly, the wavelet 
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Table 3. Estimated H, with DFA and wavelets methods, for HR series of different athletes 
(*) The series for which the test is rejected. Comparison of the two samples (H D fa)i,...,9 
and (H WA v)i,...,9 for whole and partial series (p-value) 





HR series 


Race be; 


ginning 


During 


the race 


End of race 


Hdfa 


Hwav 


Hdfa 


Hwav 


Hdfa 


Hwav 


Hdfa 


Hwav 


Athl 


0.928 


1.288* 


1.032 


1.192 


1.060 


1.214 


0.429 


1.400 


Ath2 


1.095 


1.268* 


0.905 


0.973 


1.126 


1.108 


1.240 


1.452* 


Ath3 


1.163 


1.048 


0.553 


0.898 


1.130 


1.172 






Ath4 


1.193 


0.916* 






1.098 


1.249* 


1.172 


1.260 


Ath5 


1.239 


1.110 


1.267 


1.117* 


1.133 


1.205 


1.273 


1.348* 


Ath6 


1.247 


1.084* 


1.237 


1.106 


1.091 


1.172 


1.436 


1.338 


Ath7 


1.155 


1.095 


0.850 


1.295 


1.182 


1.186* 


1.129 


1.209 


Ath8 


1.258 


1.011 


1.304 


1.128* 


0.995 


1.134 


1.122 


1.247 


Ath9 


1.243 


1.429* 


0.820 


1.019 


1.127 


1.535* 


1.250 


1.238* 



p-value 0.6414 0.3723 0.0225 0.1260 



F-stat 0.23 0.85 6.38 2.65 



analysis is applied to the 9 (whole or partial) HR time series (the chosen "mother" wavelet 
is a kind of Lemarie-Meyer wavelet such that /3 = 2a). Using first a very large band of 
scales for all HR time series (for example [0.01, 12] in Fig. [10]), one estimation of frequency 
band is deduced: \uJq, ujT] = [0.2,4] is the chosen frequency band for the whole and partial 
signals. 




-6 -4-2 2 -3 -2 -1 1 

Logarithm of the chosen frequencies (Hz) Logarithm of the chosen frequencies (Hz) 



Fig. 10. The log-log graph of the variance of wavelet coefficients relating to the HR series observed 
in the middle of the exercise (Ath5) 

The estimation results of H, for the different signals observed during the three phases of the 
race, are recapitulated in the Table [3l 



Both DFA and wavelet analysis methods provide estimations of Hurst exponent which re- 
flect the possible modeling of HR data with long range dependence time series. 
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We also note that with a p-value of 0.64, both the samples {Hjjfa)i,...,9 and {Hwav)i,...,9 
obtained from all HR time series are significantly close. 

The same comparison can also be done when the three characteristic stages of the race (be- 
ginning, middle and end of the race) are distinguished. The result is different. Indeed, the 
corresponding p- values between {Hdfa)x,...,9 and (Hwav)i,...,9 are significatively different 
in the middle part of the race (and relatively different in the stage of race end) . 

In spite of values relating to the estimator of H for all the athletes in the different phases 
which are relatively large, the DFA has sometimes tendency to under estimating this param- 
eter like in the race beginning (Ath3) and the end of race (Athl). Indeed, these value are 
clearly due to a certain trend supports by the fact t hat data points in log-log plot (Fig. [II]) 
have not a straight line form, and we have proved in lBardet and Kammoun (2008T ) that the 
DFA method is not robust in the case of trended long range dependent process. However in 
both the cases, the wavelets method is more effective since it removes sufficiently this kind 
of trend. 



i yo 




Fig. 1 1 . The results of the DFA method applied to records for race beginning (Ath3) (left) and for end 
of race (Ath1) (right) 



For HR data and when the goodness-of-fit test is accepted, the wavelet method shows a 
fractal parameter H close to 1. According to the different studies (using DFA method) 
about physiologic time se ries for distinguish i ng healthy from p athologic data sets (see 
Goldberger et al. f2002f) . IPeng et al. (19951 ). IPeng et al. (1993T )). an exponent H ~ 1 



indicate a healthy cardiac HR time series. Indeed, for the study concerning a 24 hours 
recorded interbeat time series during the exercise for healthy adults and heart failure adults, 
the following results are obtained: for healthy subjects, H — 1.01 ± 0.16, for the group of 
heart failure subjects H = 1.24 ± 0.22. 

During the different stages of the marathon race, a small increase of the fractal param- 
eter H is observed especially at the end of races. This behavior and this evolution may be 
associated with fatigue appearing during the last phase of the marathon. This evolution 
can not be observed with DFA method. Indeed, in one hand, when we observe the three 
9-samples of wavelet estimators (related to the 3 phases of the race), the p-value (see Fig. 
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[T2|) indicates a significantly difference due precisely to this evolution of the fractal param- 
eter. On the other hand, a large p-value (0.85) is obtained for the same test using DFA 
estimation. 



Boxplot of the three samples of estimations by applying wavelet analysis 



HdFA Hy/AV 



p-value 0.8570 0.0158 
F-stat 0.16 5.27 



Race beginning 



During the race 



Fig. 12. Comparison of the three samples constituting by estimations in the beginning of race, during 
the race and then in the race end by the DFA and wavelet methods 



The representation given by Fig. [12l highlight a difference in the behaviors of HR series in 
the beginning of the race and in the end of race. Indeed, the dispersions in the first and 
last sample are more important than in the middle of race and it seems that each athlete 
starts and finished the race at his own rhythm but in the middle athletes seems to have the 
same rate. 



5. Conclusion 

As indicated in the beginning of the last section, our main goal is to see whether the heart 
rate time series during the race have specific properties that of scaling law behavior. The 
wavelet analysis and the DFA methods are applied to 9 HR time series during the whole 
and also the different three phases of the race (beginning, middle and end of race) obtained 
by an automatic procedure. Even if their results are not exactly the same, both methods 
provide Hurst exp onents which reflect the possib le modeling of HR data by a LRD time 
series. However, in lBardet and Kammoun (2008), even if the DFA estimator of Hurst pa- 



rameter is proved to be convergent with a reasonable convergence rate for LRD stationary 
Gaussian processes, it is not at all a robust method in case of trend. The wavelet based 
method provides a more precise and robust estimator of the Hurst parameter. Thus, the 
results obtained from this wavelet estimator seem to be more valid. 



Moreover, a chi-squared goodness-of-fit test can also be deduced from this method. It 
seems to show that a classical LRD stationary Gaussian process is not exactly a suitable 
model for HR data. Graphs obtained with wavelet analysis also show that a locally frac- 
tional Gaussian noise, a semiparametric process defined in Section [3] could be more relevant 
to model these data. A chi-squared test confirms the goodness-of-fit of such a model. Thus, 
using the wavelet estimation of a fractal parameter in a specific frequency band, one obtains 
a conclusion relatively close to those obtained by other studies (conclusion which can not 
be detected with DFA method): these fractal parameters increase through the race phases, 
what may be explained with fatigue appearing during the last phase of the marathon. Thus 
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this fractal parameter may be a relevant factor to detect a change during a long-distance 
race. 

Finally, for the 9 athletes and as the test is validated with significance level around 0.65, we 
can estimate Hbegiming at 1.1, the H m iddie at 1.2 and H enc i at 1.3 with a larger confidence 
interval at the beginning and the end of the race. This behavior could bring a new way of 
understanding what is happening during a race. 
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